Joint Factor Analysis for Speaker Recognition Reinterpreted as Signal Coding Using Overcomplete Dictionaries
Abstract
This paper presents a reinterpretation of Joint Factor Analysis (JFA) as a signal approximation methodology, based on ridge regression, using an overcomplete dictionary learned from data. We provide a non-probabilistic perspective, based on point estimates, of the three fundamental steps in the JFA paradigm. Specifically, the model training, hyperparameter estimation, and scoring stages are equated to signal coding, dictionary learning, and similarity computation, respectively. Establishing a connection between these two well-researched areas opens the door to cross-pollination between the two fields. As an example, we propose two novel ideas that arise naturally from the non-probabilistic perspective and result in faster hyperparameter estimation and improved scoring. The proposed technique for hyperparameter estimation avoids the need for explicit matrix inversions in the M-step of the ML estimation. This allows the use of faster techniques, such as Gauss-Seidel iteration or Cholesky factorization, to compute the posterior means of the factors x, y and z during the E-step. For scoring, a similarity measure based on a normalized inner product is proposed and shown to outperform the state-of-the-art linear scoring approach commonly used in JFA. Both techniques are validated experimentally with closed-set identification and speaker verification experiments on the Switchboard database.
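The ingredients named in the abstract can be illustrated with a minimal sketch. It is not the paper's JFA formulation, which this page does not give; it is plain ridge regression on a toy overcomplete dictionary. The names `gram_system`, `solve_cholesky`, `solve_gauss_seidel`, and `normalized_inner_product`, the dictionary `D`, the signal `s`, and the ridge weight `lam` are all hypothetical choices made for illustration. The code solves the regularized normal equations (DᵀD + λI)a = Dᵀs twice, once by Cholesky factorization and once by Gauss-Seidel sweeps, so that no explicit matrix inverse is formed, and it computes a normalized inner product as the similarity measure.

```python
import math

def gram_system(D, s, lam):
    """Build A = D^T D + lam*I and b = D^T s (D is a list of rows; atoms are columns)."""
    d, k = len(D), len(D[0])
    A = [[sum(D[r][i] * D[r][j] for r in range(d)) + (lam if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [sum(D[r][i] * s[r] for r in range(d)) for i in range(k)]
    return A, b

def solve_cholesky(A, b):
    """Solve A x = b for symmetric positive definite A via A = L L^T."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = A[i][j] - sum(L[i][p] * L[j][p] for p in range(j))
            L[i][j] = math.sqrt(acc) if i == j else acc / L[j][j]
    y = [0.0] * n  # forward substitution: L y = b
    for i in range(n):
        y[i] = (b[i] - sum(L[i][p] * y[p] for p in range(i))) / L[i][i]
    x = [0.0] * n  # back substitution: L^T x = y
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[p][i] * x[p] for p in range(i + 1, n))) / L[i][i]
    return x

def solve_gauss_seidel(A, b, sweeps=200):
    """Iteratively solve A x = b; converges when A is symmetric positive definite."""
    n = len(A)
    x = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            off_diag = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - off_diag) / A[i][i]
    return x

def normalized_inner_product(u, v):
    """Cosine-style similarity: <u, v> / (||u|| * ||v||)."""
    dot = sum(a * c for a, c in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(c * c for c in v))
    return dot / (norm_u * norm_v)

# Toy overcomplete dictionary (assumed data): 2-dimensional signals, 3 atoms.
D = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
s = [1.0, 2.0]

A, b = gram_system(D, s, lam=0.1)
code_chol = solve_cholesky(A, b)    # ridge-regression code, direct solve
code_gs = solve_gauss_seidel(A, b)  # same code, iterative solve
similarity = normalized_inner_product(code_chol, code_gs)
```

With λ > 0 the system matrix is symmetric positive definite even though the dictionary has more atoms than signal dimensions, which is what makes both the Cholesky factorization and the Gauss-Seidel iteration applicable in this sketch.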
Similar resources
Rate-Distortion Analysis of Sparse Overcomplete Codes
Transform coding is a popular coding strategy with many desirable properties. The performance of a transform coder relies on the compaction of energy in a small number of coefficients in the transform domain. Most transform coders rely on linear transforms with orthogonal dictionaries; however, linear transforms are limited in that they often exploit only some of the signal structure. For more ...
Convolutional Neural Network with Adaptive Windows for Speech Recognition
Although speech recognition systems are widely used and their accuracy continues to improve, there is a considerable gap between their performance and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
Highly overcomplete sparse coding
This paper explores sparse coding of natural images in the highly overcomplete regime. We show that as the overcompleteness ratio approaches 10x, new types of dictionary elements emerge beyond the classical Gabor function shape obtained from complete or only modestly overcomplete sparse coding. These more diverse dictionaries allow images to be approximated with lower L1 norm (for a fixed SNR),...
Simultaneous denoising and compression of power system disturbances using sparse representation on overcomplete hybrid dictionaries
This study introduces a novel unified framework for simultaneous denoising and compression of electric power system disturbance signals using sparse signal decomposition and reconstruction on overcomplete hybrid dictionary (OHD) matrix. In the proposed method, the power quality signal is first decomposed into deterministic sinusoidal components and non-deterministic components using the OHD mat...
Learning Overcomplete Representations
In an overcomplete basis, the number of basis vectors is greater than the dimensionality of the input, and the representation of an input is not a unique combination of basis vectors. Overcomplete representations have been advocated because they have greater robustness in the presence of noise, can be sparser, and can have greater flexibility in matching structure in the data. Overcomplete code...
Publication date: 2010